Dear Sir:




A subset of WAVES database containing hands-free recordings was used, which consists of two 
recording sessions: parked (car parked, engine off) and city-driving (car driven on a stop and go basis). 



REMARKS 



The changes made on page 8 are illustrated in the attached "Version with markings to show changes 
made". 
VERSION WITH MARKINGS TO SHOW CHANGES MADE
IN THE SPECIFICATION: 
Page 8, lines 17-19 

A subset of WAVES database containing hands-free recordings was used, which consists 
of [three] two recording sessions: [parked-tm (car parked, engine off),] parked (car parked, 
engine off)[,] and city-driving (car driven on a stop and go basis). 



2. Use equal probabilities for P_{J|H}(j|p) and P_{K|H,J}(k|p,j):



P_{J|H}(j|p) = A    (10)

P_{K|H,J}(k|p,j) = B

3. In fact, the case described in Eq-10 consists in averaging the compensated
mean vectors \hat{m}_{p,j,k}. Referring to Eq-4 and Eq-1, it can be expected that the averaging reduces
the speech part m_{p,j,k} just as CMN does. Therefore, Eq-7 could be further simplified into:

\hat{b} = IDFT(DFT(b) ⊕ DFT(X)).    (11)

The model \hat{m}_{p,j,k} - \hat{b} is then used with CMN on noisy speech. Unfortunately, \hat{b} is a function of
both channel and background noise in all of the above cases. In other words, in the presence of noise, there
is no guarantee that the channel will be removed by such a vector, as it is for CMN.
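
The simplification in Eq-11 can be illustrated with a minimal sketch. It assumes that the operator ⊕ denotes addition of energies carried out on log-spectral values (a log-add), and that the DFT/IDFT pair maps between the cepstral and log-spectral domains; an orthonormal DCT stands in for that pair here. The function names dct_matrix, log_add and compensate_bias are hypothetical and are not taken from the specification.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed stand-in for the IDFT of Eq-11:
    it maps a log-spectral vector to a cepstral vector)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)  # scale the first row so that c @ c.T is the identity
    return c

def log_add(a, b):
    """Assumed meaning of the operator in Eq-11: log(exp(a) + exp(b)), element-wise."""
    return np.logaddexp(a, b)

def compensate_bias(b_cep, noise_cep):
    """Sketch of Eq-11: move both cepstral vectors to the log-spectral domain,
    combine them with log-add, and return to the cepstral domain."""
    c = dct_matrix(b_cep.shape[0])
    to_logspec = c.T  # inverse of the orthonormal matrix, plays the role of the DFT
    combined = log_add(to_logspec @ b_cep, to_logspec @ noise_cep)
    return c @ combined
```

Under these assumptions, when the noise term is far below the bias in every log-spectral bin, the compensated bias is essentially the original bias. As the noise grows, the result depends on both terms, which is the dependence on background noise noted above.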

A subset of WAVES database containing hands-free recordings was used, which consists 
of two recording sessions: parked (car parked, engine off) and city-driving (car driven on a stop 
and go basis). 

In each session, 20 speakers (10 male) read 40 sentences each, giving 800 utterances.
Each sentence is either a 10, 7 or 4 digit sequence, with equal probabilities. The database is
sampled at 8 kHz, with an MFCC analysis frame rate of 20 ms. The feature vector consists of 10 static
and 10 dynamic coefficients.
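
The figures quoted above can be checked with a short calculation. The sketch below assumes that the 20 ms frame rate is the shift between successive analysis frames; the helper name frame_shift_samples and the constant names are hypothetical.

```python
def frame_shift_samples(sample_rate_hz, frame_shift_ms):
    """Number of samples between successive analysis frames."""
    return sample_rate_hz * frame_shift_ms // 1000

SPEAKERS = 20
SENTENCES_PER_SPEAKER = 40
STATIC_COEFFS = 10
DYNAMIC_COEFFS = 10

# Utterances per recording session, as stated in the text.
utterances_per_session = SPEAKERS * SENTENCES_PER_SPEAKER

# Dimension of each feature vector (static plus dynamic coefficients).
feature_dim = STATIC_COEFFS + DYNAMIC_COEFFS

# Frame shift in samples at 8 kHz with a 20 ms frame rate.
shift = frame_shift_samples(8000, 20)
```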

HMMs used in all experiments are trained on TIDIGITS clean speech data. Utterance-based
cepstral mean normalization is used. The HMMs contain 1957 mean vectors and 270
diagonal variances. Evaluated on the TIDIGITS test set, the recognizer gives a 0.36% word error rate.
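
Utterance-based cepstral mean normalization can be sketched as follows. This is a generic illustration rather than the recognizer's actual code: it assumes the features of one utterance are stored as a frames-by-coefficients array, and the function name utterance_cmn is hypothetical.

```python
import numpy as np

def utterance_cmn(features):
    """Subtract the per-utterance mean of each cepstral coefficient.

    features: array of shape (num_frames, num_coeffs) for a single utterance.
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)
```

A constant offset added to every frame, which is how a fixed channel appears in the cepstral domain when no noise is present, is removed by this subtraction.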